bayesian mixture
Dynamic Knowledge Injection for AIXI Agents
Yang-Zhao, Samuel, Ng, Kee Siong, Hutter, Marcus
Prior approximations of AIXI, a Bayesian optimality notion for general reinforcement learning, can only approximate AIXI's Bayesian environment model using an a-priori defined set of models. This is a fundamental source of epistemic uncertainty for the agent in settings where the existence of systematic bias in the predefined model class cannot be resolved by simply collecting more data from the environment. We address this issue in the context of Human-AI teaming by considering a setup where additional knowledge for the agent in the form of new candidate models arrives from a human operator in an online fashion. We introduce a new agent called DynamicHedgeAIXI that maintains an exact Bayesian mixture over dynamically changing sets of models via a time-adaptive prior constructed from a variant of the Hedge algorithm. The DynamicHedgeAIXI agent is the richest direct approximation of AIXI known to date and comes with good performance guarantees. Experimental results on epidemic control on contact networks validates the agent's practical utility.
A Circuit Complexity Formulation of Algorithmic Information Theory
Inspired by Solomonoffs theory of inductive inference, we propose a prior based on circuit complexity. There are several advantages to this approach. First, it relies on a complexity measure that does not depend on the choice of UTM. There is one universal definition for Boolean circuits involving an universal operation such as nand with simple conversions to alternative definitions such as and, or, and not. Second, there is no analogue of the halting problem. The output value of a circuit can be calculated recursively by computer in time proportional to the number of gates, while a short program may run for a very long time. Our prior assumes that a Boolean function, or equivalently, Boolean string of fixed length, is generated by some Bayesian mixture of circuits. This model is appropriate for learning Boolean functions from partial information, a problem often encountered within machine learning as "binary classification." We argue that an inductive bias towards simple explanations as measured by circuit complexity is appropriate for this problem.
Variational Inference for Bayesian Mixtures of Factor Analysers
We present an algorithm that infers the model structure of a mix(cid:173) ture of factor analysers using an efficient and deterministic varia(cid:173) tional approximation to full Bayesian integration over model pa(cid:173) rameters. This procedure can automatically determine the opti(cid:173) mal number of components and the local dimensionality of each component (Le. the number of factors in each factor analyser) . Alternatively it can be used to infer posterior distributions over number of components and dimensionalities. Since all parameters are integrated out the method is not prone to overfitting. Using a stochastic procedure for adding components it is possible to per(cid:173) form the variational optimisation incrementally and to avoid local maxima.
Conditional Visual Tracking in Kernel Space
We consider the problem of inferring 3D articulated human motion from monocular video. This research topic has applications for scene understanding including human-computer in- terfaces, markerless human motion capture, entertainment and surveillance. A monocular approach is relevant because in real-world settings the human body parts are rarely com- pletely observed even when using multiple cameras. This is due to occlusions form other people or objects in the scene. A robust system has to necessarily deal with incomplete, ambiguous and uncertain measurements. Methods for 3D human motion reconstruction can be classified as generative and discriminative. They both require a state representation, namely a 3D human model with kinematics (joint angles) or shape (surfaces or joint po- sitions) and they both use a set of image features as observations for state inference.
Conditional Visual Tracking in Kernel Space
Sminchisescu, Cristian, Kanujia, Atul, Li, Zhiguo, Metaxas, Dimitris
We present a conditional temporal probabilistic framework for reconstructing 3D human motion in monocular video based on descriptors encoding image silhouette observations. For computational efficiency we restrict visual inference to low-dimensional kernel induced nonlinear state spaces. Our methodology (kBME) combines kernel PCA-based nonlinear dimensionality reduction (kPCA) and Conditional Bayesian Mixture of Experts (BME) in order to learn complex multivalued predictors between observations and model hidden states. This is necessary for accurate, inverse, visual perception inferences, where several probable, distant 3D solutions exist due to noise or the uncertainty of monocular perspective projection. Low-dimensional models are appropriate because many visual processes exhibit strong nonlinear correlations in both the image observations and the target, hidden state variables. The learned predictors are temporally combined within a conditional graphical model in order to allow a principled propagation of uncertainty. We study several predictors and empirically show that the proposed algorithm positively compares with techniques based on regression, Kernel Dependency Estimation (KDE) or PCA alone, and gives results competitive to those of high-dimensional mixture predictors at a fraction of their computational cost. We show that the method successfully reconstructs the complex 3D motion of humans in real monocular video sequences.
Conditional Visual Tracking in Kernel Space
Sminchisescu, Cristian, Kanujia, Atul, Li, Zhiguo, Metaxas, Dimitris
We present a conditional temporal probabilistic framework for reconstructing 3D human motion in monocular video based on descriptors encoding image silhouette observations. For computational efficiency we restrict visual inference to low-dimensional kernel induced nonlinear state spaces. Our methodology (kBME) combines kernel PCA-based nonlinear dimensionality reduction (kPCA) and Conditional Bayesian Mixture of Experts (BME) in order to learn complex multivalued predictors between observations and model hidden states. This is necessary for accurate, inverse, visual perception inferences, where several probable, distant 3D solutions exist due to noise or the uncertainty of monocular perspective projection. Low-dimensional models are appropriate because many visual processes exhibit strong nonlinear correlations in both the image observations and the target, hidden state variables. The learned predictors are temporally combined within a conditional graphical model in order to allow a principled propagation of uncertainty. We study several predictors and empirically show that the proposed algorithm positively compares with techniques based on regression, Kernel Dependency Estimation (KDE) or PCA alone, and gives results competitive to those of high-dimensional mixture predictors at a fraction of their computational cost. We show that the method successfully reconstructs the complex 3D motion of humans in real monocular video sequences.
Conditional Visual Tracking in Kernel Space
Sminchisescu, Cristian, Kanujia, Atul, Li, Zhiguo, Metaxas, Dimitris
We present a conditional temporal probabilistic framework for reconstructing 3Dhuman motion in monocular video based on descriptors encoding image silhouette observations. For computational efficiency we restrict visual inference to low-dimensional kernel induced nonlinear state spaces. Our methodology (kBME) combines kernel PCA-based nonlinear dimensionality reduction (kPCA) and Conditional Bayesian Mixture of Experts (BME) in order to learn complex multivalued predictors betweenobservations and model hidden states. This is necessary for accurate, inverse, visual perception inferences, where several probable, distant3D solutions exist due to noise or the uncertainty of monocular perspectiveprojection. Low-dimensional models are appropriate because many visual processes exhibit strong nonlinear correlations in both the image observations and the target, hidden state variables. The learned predictors are temporally combined within a conditional graphical modelin order to allow a principled propagation of uncertainty. We study several predictors and empirically show that the proposed algorithm positivelycompares with techniques based on regression, Kernel Dependency Estimation (KDE) or PCA alone, and gives results competitive tothose of high-dimensional mixture predictors at a fraction of their computational cost. We show that the method successfully reconstructs the complex 3D motion of humans in real monocular video sequences.
Covariance Kernels from Bayesian Generative Models
We propose the framework of mutual information kernels for learning covariance kernels, as used in Support Vector machines and Gaussian process classifiers, from unlabeled task data using Bayesian techniques. We describe an implementation of this framework which uses variational Bayesian mixtures of factor analyzers in order to attack classification problems in high-dimensional spaces where labeled data is sparse, but unlabeled data is abundant.
Covariance Kernels from Bayesian Generative Models
We propose the framework of mutual information kernels for learning covariance kernels, as used in Support Vector machines and Gaussian process classifiers, from unlabeled task data using Bayesian techniques. We describe an implementation of this framework which uses variational Bayesian mixtures of factor analyzers in order to attack classification problems in high-dimensional spaces where labeled data is sparse, but unlabeled data is abundant.
Covariance Kernels from Bayesian Generative Models
We propose the framework of mutual information kernels for learning covariance kernels, as used in Support Vector machines and Gaussian process classifiers, from unlabeled task data using Bayesian techniques. We describe an implementation of this framework whichuses variational Bayesian mixtures of factor analyzers in order to attack classification problems in high-dimensional spaces where labeled data is sparse, but unlabeled data is abundant.